Skip to content

Extend as JSON schema to include list[Any]#12

Merged
giograno merged 2 commits into
mainfrom
extend-as-json-schema
Dec 5, 2025
Merged

Extend as JSON schema to include list[Any]#12
giograno merged 2 commits into
mainfrom
extend-as-json-schema

Conversation

@giograno
Copy link
Copy Markdown
Member

@giograno giograno commented Dec 4, 2025

This framework has the feature of reducing specific types (dict[str, Any] and list[dict[str, Any]) to a string (or bytes), assuming the fields are dumped to JSON.

With this PR, we also extend the same behavior to list[Any], as we did find many occurrences in LocalStack when this would be very useful!

@giograno giograno requested a review from bentsku December 4, 2025 08:27
@giograno giograno self-assigned this Dec 4, 2025
Copy link
Copy Markdown

@bentsku bentsku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good, just added some question/comments but I might have missed something. If that's the case I'll just approve, I'm happy to be contradicted here 😄

@register_schema
class DictAsJSONSchema(Schema):
"""An Avro string schema representing a Python Dict[str, Any] or List[Dict[str, Any]] assuming JSON serialization"""
"""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: should we rename the DictAsJSONSchema to something like TypeAsJSONSchema to indicates it can map more?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion! Done it.


def is_logically_json(py_type: Type) -> bool:
"""Returns whether a given type is logically a JSON and can be serialized as such"""
return _is_list_any(py_type) or _is_list_dict_str_any(py_type) or _is_dict_str_any(py_type)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: is it worth keeping the 3 different functions as they are? Here, if we're to match the last condition, we will call origin = get_origin(py_type) three times. Would it be worth to call those things only once and pass it to those functions? Or are they used elsewhere?

If they're not used, I could see them being merged into one and we could use recursion on them?

Something like:

def is_logically_json(py_type: Type) -> bool:
    origin = get_origin(py_type)
    if not inspect.isclass(origin):
        return False

    args = get_args(py_type)
    if issubclass(origin, list):
        if not args:
            return False
        elif args == (Any,):
            return True
        return is_logically_json(args[0])

    elif issubclass(origin, dict):
        return args == (str, Any)

Or something like that? I'm not sure it's really clearer, with some comments it would be better but at least we're calling the methods only once every time and we can add more case more easily, and the recursion would handle all cases of "logically JSON"? it might be the case already though

I feel like this would match the Dict[str, List[Any]] for example which I'm not sure currently it would work

Copy link
Copy Markdown

@bentsku bentsku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, after discussing offline the recursion method is more permissive and we'd like to have more control over that.

I think if we need to add another check in is_logically_json we should look into refactoring that code, as we're calling the same methods over py_type a lot of type, and we could probably improve that, but for now that's good 👍

@giograno giograno merged commit fd4ec45 into main Dec 5, 2025
6 checks passed
@giograno giograno deleted the extend-as-json-schema branch December 5, 2025 07:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants